Abstract

We identify the presence of Pet-Fish problem situations and the corresponding Guppy effect of concept 
theory on the World-Wide Web. For this purpose, we introduce absolute weights for words expressing 
concepts and relative weights between words expressing concepts, and the notion of 'meaning bound' 
between two words expressing concepts, making explicit use of the conceptual structure of the World-Wide
Web. The Pet-Fish problem occurs whenever there are exemplars - in the case of Pet and Fish
these can be Guppy or Goldfish - for which the meaning bound with respect to the conjunction is stronger 
than the meaning bounds with respect to the individual concepts. 



1 Introduction 



In Aerts & Gabora (2005a, b), we introduced a modeling scheme for concepts and their combinations that 
makes use of the mathematical formalism of quantum physics. This quantum modeling scheme has been 
further worked out in Aerts (2009a) and Aerts (2010a, b). The experimental data we used to create our 
modeling scheme were data collected in experiments with human subjects that were conducted within the 
framework of concepts research in psychology (Hampton 1988a, b). These experiments required human 
subjects to estimate typicalities of exemplars of concepts and their combinations. The results of these 
estimations were in conflict with how combinations of concepts such as 'conjunction' and 'disjunction' 
were expected to behave classically, as prescribed by classical logic or set theory. Hampton called these 
deviations from classical behavior 'overextension' and 'underextension', depending on their relation to the 
classically expected values of typicality (Hampton 1988a,b). 
Hampton's experiments, and his search for over- and underextended behavior of concepts and their
conjunctions and disjunctions, were inspired by the well-known Pet-Fish problem in concept theory, the
first discussion of such a deviation from classicality for the conjunction of concepts. Osherson and Smith
(1981) considered the conceptual combination of Pet and Fish into the conjunction Pet-Fish, and measured 
the typicality of different exemplars with respect to the concepts Pet, Fish and their conjunction Pet-Fish. 
They observed that exemplars such as Guppy and Goldfish give rise to a typicality with respect to the 
conjunction Pet-Fish which is much bigger than would be expected if the conjunction of Pet and Fish were 
treated from the perspective of classical logic and set theory. Indeed, from a classical perspective, i.e. when 
modeled by fuzzy set theory applying the maximum rule for conjunction, the typicality of an exemplar 
with respect to the conjunction of two concepts should not exceed the maximum of the typicalities of this 
exemplar with respect to both concepts apart. The effect present in the Pet-Fish problem is often referred 
to as the Guppy effect. 

In Aerts and Gabora (2005a, b), we analyzed the Pet-Fish problem in detail, showing how, within the 
quantum modeling scheme we proposed, the experimental data violating the classically expected values 
could be modeled correctly. The aim of the present article is to show how the phenomenon underlying the 
Pet-Fish problem can be identified on the World-Wide Web. More specifically, we can show that the same 
kind of violation of classicality put forward in the Pet-Fish problem by Osherson and Smith (1981) also
occurs when conceptual data are gathered on the World-Wide Web in a specific way. We will explain this 
below. This way of extracting conceptual structure from the World-Wide Web was put forward recently 
by one of the authors as part of the elaboration of a new interpretation of quantum mechanics analyzing 
the violation of Bell inequalities (Aerts 2009b, 2010c). It is currently being elaborated into a quantum 
mechanical model for the World-Wide Web (Aerts 2010e,f). The present article will show that the technique 
put forward in Aerts (2009b) and Aerts (2010c,e,f) can also be used to identify the presence of Pet-Fish 
problem situations on the World-Wide Web. 

In recent years, the quantum modeling of situations in cognition and in other domains different from 
the micro-world has become the focus of research of several scientists working in the newly emerging field 
of research called 'Quantum Interaction'. This has already led to the organization of three successful 
international workshops (Bruza et al. 2007, 2008, 2009). Several effects observed in the field of cognition 
have been linked to the presence of quantum structure, more specifically 'the disjunction effect' (Busemeyer, 
Wang and Townsend 2006; Pothos and Busemeyer 2009), 'the conjunction fallacy' (Franco 2009), but also 
the Allais and Ellsberg paradoxes in economics (Allais 1953; Ellsberg 1961; Aerts and D'Hooghe 2009). In
the present paper we confine ourselves to an analysis of the Pet-Fish problem, but in Aerts (2010e,f) we 
show that the other non-classical effects can be studied as well by collecting data on the World-Wide Web 
in a similar way as we do here for the Pet-Fish problem. To this end, we used Yahoo's search engine to 
find the numbers of webpages containing certain terms or words expressing concepts and combinations of 
concepts. The reason why we preferred Yahoo to Google to compile our statistics is that we found Yahoo
to produce more consistent results over time. Since these numbers change over time, we should add that 
we carried out the web searches necessary for this research on May 4, 2010. 

2 Relative Weights 

We found 1,290,000,000 webpages containing the word Pet, 1,100,000,000 webpages containing the word 
Fish, and 1,760,000 webpages containing the combination of words Pet-Fish. Our aim was to find only
results containing the exact combination, so we entered the search term "pet fish". We then considered
another word, Hierarchy; we will make clear in the following why we chose a rather uncommon
word like Hierarchy to start our analysis. We found 4,210,000 webpages containing both words Pet and
Hierarchy, 6,550,000 webpages containing both words Fish and Hierarchy and 1,410 webpages containing 
both the word expressing the conjunction concept Pet-Fish and the word Hierarchy. If we calculate
the relative weights, i.e. the number of webpages containing Pet and Hierarchy divided
by the number of webpages containing Pet, denoting these relative weights as w(Pet, Hierarchy), and
equally so for Fish and Hierarchy, and Pet-Fish and Hierarchy, we find that w(Pet, Hierarchy) = 0.00326,
w(Fish, Hierarchy) = 0.00595 and w(Pet-Fish, Hierarchy) = 0.00080, respectively. This means that

$$w(\text{Pet-Fish}, \text{Hierarchy}) < w(\text{Pet}, \text{Hierarchy}) \qquad (1)$$
$$w(\text{Pet-Fish}, \text{Hierarchy}) < w(\text{Fish}, \text{Hierarchy}) \qquad (2)$$

Inequalities (1) and (2) show that the words Pet, Fish and Pet-Fish behave classically with respect to
the word Hierarchy, if we regard the relative weights as adequate measures for the typicality, within a
classical fuzzy set theoretic modeling and apply the maximum rule for the conjunction. In general terms,
we therefore define the relative weight w(A, B) of word B with respect to word A as the number of webpages
containing both word A and word B divided by the number of webpages containing word A.

Let us now consider the word Guppy and calculate the relative weights of this word with respect to
Pet, Fish and Pet-Fish. We found 3,050,000 webpages containing the word Pet and the word Guppy,
4,520,000 webpages containing the word Fish and the word Guppy, and 37,900 webpages containing the
word Pet-Fish and the word Guppy. This gives us w(Pet, Guppy) = 0.00236, w(Fish, Guppy) = 0.00411
and w(Pet-Fish, Guppy) = 0.02153. With respect to Guppy, this gives us a relation between the relative
weights that is contrary to the one we have with respect to Hierarchy, namely

$$w(\text{Pet}, \text{Guppy}) < w(\text{Pet-Fish}, \text{Guppy}) \qquad (3)$$
$$w(\text{Fish}, \text{Guppy}) < w(\text{Pet-Fish}, \text{Guppy}) \qquad (4)$$

In other words, this is a manifest instance of non-classical behavior of the concept expressed by the word
Guppy with respect to the concepts expressed by the words Pet, Fish and Pet-Fish.

3 Correcting Counts

Before giving further examples to illustrate the frequent occurrence of the Pet-Fish problem on the
World-Wide Web, we should briefly point out a specific problem concerning the numbers of webpages yielded
by the Yahoo search engine, as well as by the Google search engine, for that matter. For example,
when investigating whether there was a Guppy effect with respect to the concept World, we found the
number of webpages containing Pet and World to be 1,030,000,000. The number of webpages containing
Pet and 'not' containing World was 890,000,000. However, the number of webpages containing the word
Pet was 1,290,000,000, i.e. much lower than the sum of the number of webpages containing Pet and World
(1,030,000,000) and the number of webpages containing Pet and 'not' containing World (890,000,000). So
the counts by Yahoo, and equally so by Google, were incorrect here. We noticed that this error occurred
when making combinations with words that are very abundant on the World-Wide Web, such as World.
Indeed, the number of webpages containing World was 11,100,000,000. To introduce a correction for this
error, we proceeded as follows. We divided the number of webpages containing Pet, i.e. 1,290,000,000, by
the sum of the number of webpages containing Pet and World and the number of webpages containing
Pet and 'not' containing World, i.e. 1,030,000,000 + 890,000,000, which gave a correction factor 0.671875.
We then multiplied this correction factor by the number of webpages containing Pet and World found using
Yahoo, which gave us 692,031,250. We consider this 'corrected number' to be a fair estimate of the number
of webpages containing both words Pet and World.
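
As a rough illustration of Sections 2 and 3, the relative weights and the correction of counts can be reproduced in a few lines of Python. This is only a sketch under stated assumptions: the function names `relative_weight` and `corrected_count` are hypothetical and were chosen for this example, and the page counts are the ones quoted in the text.

```python
# Page counts quoted in the text (Yahoo, May 4, 2010).
N_PET, N_FISH, N_PET_FISH = 1_290_000_000, 1_100_000_000, 1_760_000


def relative_weight(n_both, n_a):
    """w(A, B): pages containing both A and B, divided by pages containing A."""
    return n_both / n_a


def corrected_count(n_a, n_with_b, n_without_b):
    """Rescale n_with_b so that 'with B' plus 'without B' sums to n_a."""
    correction = n_a / (n_with_b + n_without_b)
    return correction * n_with_b


# Relative weights of Hierarchy and Guppy with respect to Pet, Fish, Pet-Fish.
w_hierarchy = [relative_weight(4_210_000, N_PET),
               relative_weight(6_550_000, N_FISH),
               relative_weight(1_410, N_PET_FISH)]
w_guppy = [relative_weight(3_050_000, N_PET),
           relative_weight(4_520_000, N_FISH),
           relative_weight(37_900, N_PET_FISH)]

# Inequalities (1)-(2): the conjunction weight lies below both single weights.
hierarchy_is_classical = w_hierarchy[2] < min(w_hierarchy[:2])
# Inequalities (3)-(4): the conjunction weight exceeds both single weights.
guppy_is_nonclassical = w_guppy[2] > max(w_guppy[:2])

# Section 3: corrected number of pages containing both Pet and World.
pet_world_corrected = corrected_count(N_PET, 1_030_000_000, 890_000_000)

print(hierarchy_is_classical, guppy_is_nonclassical, round(pet_world_corrected))
```

With the counts above, both checks hold and the corrected Pet and World count comes out at 692,031,250, in agreement with the value given in Section 3.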
4 The Guppy Effect and Meaning



We gathered data on the World-Wide Web for several concepts with the aim of understanding the Guppy
effect, and the results are listed in Table 1. Tot. N is the total number of webpages found in our searches
conducted on May 4, 2010, using the Yahoo search engine. Rel. N is the relative number of webpages, e.g. in
the first column it is the number of webpages containing Pet and one of the other words considered, viz.
Guppy, World, Spelling, House, Goldfish and Hierarchy. In the second column, Rel. N is the number of
webpages containing Fish and one of the other aforementioned words, while in the third column, it is the
number of webpages containing Pet-Fish and one of the other words. Rel. -N is the relative number of
webpages 'not' containing the word concerned. For example, in the region where Guppy is considered, in
the first column, Rel. -N is the number of webpages containing Pet and 'not' containing Guppy. In the
second column of the region where Guppy is considered, it is the number of webpages containing Fish and
'not' containing Guppy, and in the third column of this region, it is the number of webpages containing
Pet-Fish and 'not' containing Guppy. An analogous approach applies to the other regions, considering the
words World, Spelling, House, Goldfish and Hierarchy. Corr. is the 'correction factor' for every region and
for every column within each region. For example, for the first column, it is the number of webpages
containing Pet divided by the sum of Rel. N and Rel. -N of the region considered. Rel. N corr. is the
corrected relative number of webpages, i.e. in each column of every region of a specific word
it is the Rel. N of this region multiplied by the correction factor of this region and the column concerned.
Finally, Rel. w is the relative weight, which for each region corresponding to a specific
word is given by Rel. N corr. divided by Tot. N.

The relative weight is a number between 0 and 1. For a given region and for the first column, it indicates
the fraction of webpages containing Pet and the word corresponding to this region with respect to the total 
number of webpages containing Pet. For the second and third columns, it indicates these same fractions 
with respect to Fish and Pet-Fish. 

Before we interpret the results of the data we collected on the World-Wide Web, we should explain 
yet another aspect of our analysis. There is hardly any in-depth knowledge about 'how many webpages 
in all the World-Wide Web comprises at this moment'. Estimations of how large this total number may 
be vary because the outcome somehow depends on such factors as the manner of counting and the type 
of pages considered. For example, when we entered the word 'and' in the Yahoo search engine, we found 
34,900,000,000 webpages. When we entered the word 'the', it returned 36,400,000,000 hits, while the 
number '1' was found 49,000,000,000 times. Most probably, the latter search yielded the largest number 
of hits we were able to get in this way, since '1' also counts webpages in languages different from English, 
unlike the cases of 'the' and 'and'. The reason why we would like to know the total number of webpages is 
that this allows us to introduce a quantity with respect to any concept. We have called this quantity the 
'absolute weight' of this concept, i.e. the number of webpages containing the word expressing this concept 
divided by the totality of webpages comprising the World-Wide Web as a whole. Table 1 lists the results 
of our calculations of this quantity for the different concepts considered, assuming 55,000,000,000 to be the 
total number of webpages of the World-Wide Web indexed by Yahoo (Kunder 2010). 
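
The absolute weight can be sketched numerically as follows. This is a minimal illustration rather than a definitive procedure: the helper name `absolute_weight` is hypothetical, the page counts are the Tot. N values listed in Table 1, and the total of 55,000,000,000 webpages is the estimate assumed above.

```python
# Assumed total number of webpages indexed by Yahoo (Kunder 2010).
WEB_TOTAL = 55_000_000_000

# Tot. N values for the words considered, as listed in Table 1.
page_counts = {
    "Guppy": 12_900_000,
    "Goldfish": 32_500_000,
    "Hierarchy": 79_200_000,
    "Spelling": 291_000_000,
    "House": 4_880_000_000,
    "World": 11_100_000_000,
}


def absolute_weight(n_word, n_total=WEB_TOTAL):
    """Fraction of all webpages that contain the word."""
    return n_word / n_total


abs_w = {word: absolute_weight(n) for word, n in page_counts.items()}

for word, weight in abs_w.items():
    print(f"{word:10s} {weight:.9f}")
```

Because the absolute weight depends on the assumed total, a different estimate of the size of the World-Wide Web rescales all absolute weights by the same factor and leaves their ratios unchanged.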

We will now interpret our results. For each concept, the absolute weight provides a measure of its overall 
presence on the World-Wide Web. Guppy and Goldfish turn out to have the lowest absolute weights of all the
concepts we considered, viz. 0.000234545 = $2.34545 \cdot 10^{-4}$ and 0.000590909 = $5.90909 \cdot 10^{-4}$, respectively.
They are followed by Hierarchy, with an absolute weight of 0.00144 = $1.44 \cdot 10^{-3}$, by Spelling, with an
absolute weight of 0.005290909 = $5.290909 \cdot 10^{-3}$, and by House, with an absolute weight of
0.088727273 = $0.88727273 \cdot 10^{-1}$. The word World has an absolute weight of 0.201818182 = $2.01818182 \cdot 10^{-1}$. For each of
the three words Pet, Fish and Pet-Fish, the corresponding relative weights provide the means to measure 
their presence in 'meaning contexts'.

Table 1: Data collected on the World-Wide Web illustrating the Pet-Fish problem of concept research in psychology

                           Pet              Fish             Pet-Fish
           Tot. N          1,290,000,000    1,100,000,000    1,760,000

Guppy      Tot. N          12,900,000
           Abs. w          0.000234545
           Rel. N          3,050,000        4,520,000        37,900
           Rel. -N         1,290,000,000    1,100,000,000    1,710,000
           Corr.           0.9976412        0.9959077        1.0069226
           Rel. N corr.    3,042,806        4,501,503        38,162
           Rel. w          0.0023588        0.0040923        0.0216832
           M               10.0567          17.4477          92.4476

World      Tot. N          11,100,000,000
           Abs. w          0.201818182
           Rel. N          1,030,000,000    719,000,000      737,000
           Rel. -N         890,000,000      633,000,000      970,000
           Corr.           0.671875         0.8136095        1.0310486
           Rel. N corr.    692,031,250      584,985,207      759,882
           Rel. w          0.5364583        0.5318047        0.4317516
           M               2.6581           2.6351           2.1393

Spelling   Tot. N          291,000,000
           Abs. w          0.005290909
           Rel. N          32,100,000       29,000,000       40,200
           Rel. -N         1,280,000,000    1,090,000,000    124,000,000
           Corr.           0.9831568        0.9830206        0.9998864
           Rel. N corr.    31,559,332       28,507,596       40,195
           Rel. w          0.0244646        0.0259160        0.0228383
           M               4.6239           4.8982           4.3165

House      Tot. N          4,880,000,000
           Abs. w          0.088727273
           Rel. N          683,000,000      316,000,000      431,000
           Rel. -N         1,020,000,000    1,280,000,000    1,500,000
           Corr.           0.7574868        0.6892231        0.9114448
           Rel. N corr.    517,363,476      217,794,486      392,832
           Rel. w          0.4010570        0.1979950        0.2232004
           M               4.5201           2.2315           2.5156

Goldfish   Tot. N          32,500,000
           Abs. w          0.000590909
           Rel. N          9,790,000        9,790,000        225,000
           Rel. -N         1,280,000,000    1,280,000,000    1,500,000
           Corr.           1.0001628        0.8528520        1.0202899
           Rel. N corr.    9,791,593        8,349,421        229,565
           Rel. w          0.0075904        0.0075904        0.1304348
           M               12.8453          12.8453          220.7358

Hierarchy  Tot. N          79,200,000
           Abs. w          0.00144
           Rel. N          4,210,000        6,550,000        1,410
           Rel. -N         1,290,000,000    1,090,000,000    1,760,000
           Corr.           0.996747         1.0031462        0.9991995
           Rel. N corr.    4,196,305        6,570,607        1,408
           Rel. w          0.003253         0.005973         0.0008005
           M               2.2590           4.1481           0.5559

Now, if the measure of the presence of a word B in the 'meaning
context' of another word A equals the measure of the overall presence of this word B, this would indicate
that there is no specific 'meaning bound' between A and B. Hence, if we divide the relative weight of word 
B with respect to A by the absolute weight of word B, the resulting number is capable of expressing in
quantitative terms the meaning bound between the words A and B. This number will be equal to 1 if there
is no meaning bound, i.e. no meaning bound stronger than the overall meaning bound related to the fact
that words are members of the whole World-Wide Web. It will be larger than 1 if A and B have what we
have called an 'attractive meaning bound', and it will be smaller than 1 if A and B have what we have
called a 'repulsive meaning bound'. We have calculated these quantities and called them M(A, B) for the
words expressing the concepts which we considered with respect to the Pet-Fish problem. They are given 
in Table 1. Let us remark that

$$M(A,B) = \frac{n(A,B)/n(A)}{n(B)/n(\text{www})} = \frac{n(A,B)\,n(\text{www})}{n(B)\,n(A)} \qquad (5)$$

where n(A) and n(B) are the number of webpages containing word A and word B, respectively, n(A, B) is
the number of webpages containing both word A and word B, and n(www) is the total number of webpages of
the World-Wide Web. It is interesting to note that M(A, B) = M(B, A), and hence M(A, B) is symmetric
in A and B, which means that we can really speak of a 'meaning bound between word A and word B'.

So we can see in Table 1 that the meaning bounds between Guppy and Pet, Fish and Pet-Fish are 10.06,
17.45 and 92.45, respectively. These are all three strong attractive meaning bounds, but the meaning bound
with respect to Pet-Fish is much bigger than the meaning bounds with respect to Pet and Fish. Of the
words considered, only Goldfish gives rise to an even more strongly pronounced effect of attractive meaning
bound, namely 12.85, 12.85 and 220.74, respectively, with respect to Pet, Fish and Pet-Fish. All of the meaning
bounds we have calculated are attractive, except the one of Hierarchy with respect to Pet-Fish, which
is repulsive. This means that the relative occurrence of the word Hierarchy in webpages containing the
concept Pet-Fish is smaller than its overall occurrence on the World-Wide Web.

To explain the Guppy effect on the World-Wide Web, we want to put forward an approach based on this
meaning bound. For a conjunction of concepts, there may be exemplars of the concepts that have stronger
attractive meaning bounds with respect to the conjunction than their meaning bounds with respect to
each of the individual concepts, and, if this is the case, the Guppy effect appears. If we look again at
the examples given in Table 1, we can see that different situations are possible. For the words World and
Spelling, the meaning bound with respect to Pet-Fish is less attractive than it is with respect to Pet and
to Fish. And indeed, as we mentioned already, for the word Hierarchy, the meaning bound with respect to
Pet-Fish is repulsive, while it is attractive with respect to Pet and Fish. For the word House, the meaning
bound with respect to Pet-Fish is less attractive than with respect to Pet, but more attractive than with
respect to Fish.

5 Conclusion

Although not completely equivalent, the Guppy effect we have identified on the World-Wide Web is strongly
related to the one identified in concept research with respect to typicality (Osherson and Smith 1981) and
membership weight (Hampton 1988a). We are also quite convinced that both effects have the same origin,
and that this origin is in some sense revealed more clearly on the World-Wide Web than it is within
the context of the original psychological experiments. The origin is that 'the extent to which exemplars
are present in the meaning landscape surrounding concepts and their conjunctions' does not follow the
inequalities which are classically supposed to be fulfilled if typicality and membership are looked upon
from a fuzzy set theoretic perspective, as in the original analyses put forward by Osherson and Smith
(1981), and by Hampton (1988a). This 'extent of presence in the meaning landscape surrounding the
concepts considered and their conjunctions' is a factor that is as determining as those originating in
typicality and membership. In other words, when typicality or membership are measured in psychological 
experiments, they are strongly influenced by the effect of 'the extent of presence in the meaning landscape 
surrounding the concepts considered and their conjunctions'. In our analysis of the Pet-Fish problem on the 
World-Wide Web, we directly calculated this 'extent of presence in the meaning landscape surrounding the 
concepts considered and their conjunctions'. In Aerts (2010e), we showed how the analysis of the Pet-Fish 
problem put forward in the present article is a direct consequence of the quantum-mechanical model for 
the measurement of meaning elaborated in that publication, and how the mathematical expression for the 
'meaning bound' we introduced here follows from the quantum-mechanical weight structure. 
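
As a numerical illustration of the meaning bound of equation (5) and of the Guppy-effect criterion of Section 4, the following Python sketch recomputes M for two of the words in Table 1. It is a sketch under stated assumptions, not a definitive implementation: the names `meaning_bound`, `bounds_for` and `shows_guppy_effect` are hypothetical, the joint counts are the corrected values (Rel. N corr.) from Table 1, and 55,000,000,000 is the assumed total number of webpages.

```python
WEB_TOTAL = 55_000_000_000  # assumed total number of webpages (Kunder 2010)

# Tot. N for the three concepts (Table 1).
TOTALS = {"Pet": 1_290_000_000, "Fish": 1_100_000_000, "Pet-Fish": 1_760_000}

# Tot. N of each word and its corrected joint counts (Rel. N corr., Table 1).
WORDS = {
    "Guppy": {"n": 12_900_000,
              "joint": {"Pet": 3_042_806, "Fish": 4_501_503, "Pet-Fish": 38_162}},
    "Hierarchy": {"n": 79_200_000,
                  "joint": {"Pet": 4_196_305, "Fish": 6_570_607, "Pet-Fish": 1_408}},
}


def meaning_bound(n_ab, n_a, n_b, n_www=WEB_TOTAL):
    """Equation (5): M(A, B) = n(A, B) n(www) / (n(A) n(B))."""
    return (n_ab * n_www) / (n_a * n_b)


def bounds_for(word):
    """Meaning bounds of a word with respect to Pet, Fish and Pet-Fish."""
    data = WORDS[word]
    return {concept: meaning_bound(data["joint"][concept], n_concept, data["n"])
            for concept, n_concept in TOTALS.items()}


def shows_guppy_effect(bounds):
    """True if the bound with the conjunction exceeds both single-concept bounds."""
    return bounds["Pet-Fish"] > max(bounds["Pet"], bounds["Fish"])


m_guppy = bounds_for("Guppy")
m_hierarchy = bounds_for("Hierarchy")

print({k: round(v, 2) for k, v in m_guppy.items()}, shows_guppy_effect(m_guppy))
print({k: round(v, 2) for k, v in m_hierarchy.items()}, shows_guppy_effect(m_hierarchy))
```

Because the corrected counts in Table 1 are rounded to whole webpages, the recomputed values can differ from the tabulated M in the last digits. The symmetry M(A, B) = M(B, A) is visible in `meaning_bound`, since swapping `n_a` and `n_b` leaves the product in the denominator unchanged.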